Expanded Hidden Markov Models: Allowing Symbol Emissions in State Changes
ABSTRACT
In this paper we formally expand hidden Markov models (HMM) by allowing symbol emissions during state changes. These expanded hidden Markov models (eHMM) can contain more information than original HMM with the same number of states. This is a necessary step towards the definition of hidden non-Markovian models on the basis of discrete stochastic models. Discrete stochastic models are usually event-driven, which makes it necessary to attach information to the state changes that represent the events. The paper shows that the expanded paradigm is to some extent equivalent to original HMM, and gives an example of the new possibilities using hidden non-Markovian models.

MOTIVATION

Simulation usually involves the need to model a real or planned system and then predict its behavior to some extent. This process depends heavily on structural and statistical knowledge about the system's behavior and parameters. For various reasons, one is sometimes not able or not willing to observe the complete system internals, but rather focuses on the system's visible interaction with the environment. This could be the case when one wants to diagnose the engine of a car without taking it apart, but by solely looking at some protocol that can be read out electronically. This error protocol can be viewed as a sequence of signals that the hidden system (the engine internals) emits. In the case of a satellite in orbit, it might not even be feasible to examine the object itself; one has to rely on the signals emitted by the satellite in order to diagnose it, since the cost of sending a technician into space is simply too high.

Many real systems can be represented using discrete stochastic models such as Petri nets. Extending these by the modeling and analysis of so-called rewards is a way to include observable output in these models. Petri nets are event-driven; the state transitions are the active elements, and they model the dynamic elements of the real process. So-called impulse rewards can be associated with state transitions and can model costs, protocol signals and much more. However, the analysis methods for Petri nets do not account for hidden information, and the system is always assumed to be observable as a whole.

Hidden Markov models (HMM), on the other hand, can model hidden processes with observable outputs. HMM consider the symbol being emitted in a certain state, regardless of the previous or following state. However, the hidden model of an HMM is usually a discrete-time Markov chain (DTMC). This restricts the modeling capabilities of the paradigm to geometric state duration distributions in the discrete case.

By combining the advantages of HMM and Petri nets, it would be possible to model more general hidden models which are event-driven. Hidden non-Markovian models, which can model more realistic behavior than DTMCs, could then be analyzed with the methods of HMM, which expands the range of possible application areas. The first step is to make HMM event-driven, which means associating the emission of symbols with a state transition rather than with a state. The second step, the formalization of hidden non-Markovian models on the basis of these eHMM, will not be described here.

In this paper we formally define expanded hidden Markov models (eHMM) by associating symbol emissions with the state changes of HMM. We also adapt the three major algorithms that are used for the different analysis procedures, namely the forward algorithm for Evaluation, the Viterbi algorithm for Decoding, and the Baum-Welch algorithm for Training of eHMM.
The paper also gives an example of a hidden non-Markovian model to show the possible applications of the new paradigm.

STATE OF THE ART

Hidden Markov models and their application in speech recognition were first published around 1970; one of the first papers is (Baum et al. 1970). A comprehensive summary of the theory, including algorithms and application examples, can be found in (Rabiner 1989). The notation used in this paper is based on the notation in (Rabiner 1989). In order to make the models more flexible, some research was done on explicit state duration densities (hidden semi-Markov models), which, however, complicated the solution algorithms (Russel and Moore 1985). Expanding each HMM state into a sub-HMM was also used to realize more general state duration distributions (expanded state HMM) (Russel and Cook 1987). This increased the number of free parameters, the sub-HMM topologies were not very flexible, and the performance was tuned to speech-recognition systems. In (Wickborn et al. 2006) a method is described to find the trace probability and the corresponding state sequence for continuous stochastic models with generally distributed transitions, but the method is not applicable to the training of models. In (Isensee et al. 2006) a method for the training of hidden non-Markovian models using phase-type distributions was introduced.

Hidden Markov Model Background

Classical HMM are discrete-time Markov chains (DTMC) that stochastically emit a symbol at every time step, depending on the state that the hidden process currently resides in. These are so-called doubly stochastic processes and are sometimes also called signal models. They are widely used in speech recognition systems and sometimes in pattern recognition (Rabiner 1989).

A hidden Markov model can be described by a 5-tuple (S, V, A, B, Π). S is the set of states of the DTMC. V is the set of output symbols. A is the transition probability matrix of the DTMC. B is the output probability matrix, with b_ij being the probability of emitting symbol v_j in state s_i. Π is the initial probability vector of the DTMC. A sequence of states of the DTMC is denoted as Q = {q_1, q_2, q_3, ..., q_T} and an observed output sequence as O = {o_1, o_2, o_3, ..., o_T}, with T being the maximum number of steps of the DTMC. A parameterization of a specific HMM is often denoted by λ = (A, B, Π), which fully defines the model.

There are three basic questions that can be answered for an HMM and a given output sequence (trace):
1. What is the probability of producing the given output sequence O with the given model λ?
2. What is the most probable state sequence Q of the model λ that produced the observed output sequence O?
3. Maximize the probability with which the model λ produces the output sequence O by training the parameters of the model.

The first question can be efficiently answered by the forward algorithm (Rabiner 1989). The most likely state sequence can be determined by the Viterbi algorithm (Viterbi 1967). The third task of training an HMM is solved by the so-called Baum-Welch algorithm (Baum et al. 1970). Sketches of the forward recursion and its eHMM counterpart are given below, after the eHMM definition.

eHMM – DEFINITION AND NOTATION

First, the basic definition of eHMM is adapted from HMM. Most of the components stay the same. The changes are that the state sequence is extended by one state and starts at time t = 0, which makes it one element longer than the symbol trace. The most important change can be found in the emission probabilities B: these are no longer attached to a specific state, but to a transition between two states.
Formally, an eHMM is described by a 5-tuple λ = (S, V, A, B, Π):

S = {s_1, ..., s_N}: the set of states,
V = {v_1, ..., v_M}: the set of output symbols,
A = {a_ij} with a_ij = P(q_t = s_j | q_{t-1} = s_i): the state transition probabilities,
B = {b_ij(k)} with b_ij(k) = P(o_t = v_k | q_{t-1} = s_i, q_t = s_j): the probabilities of emitting symbol v_k during the transition from s_i to s_j,
Π = {π_i} with π_i = P(q_0 = s_i): the initial state probabilities.

The state sequence is accordingly denoted as Q = {q_0, q_1, ..., q_T}, one element longer than the observed symbol trace O = {o_1, o_2, ..., o_T}.
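As referenced above, the following is a minimal NumPy sketch of the classical forward algorithm for the Evaluation problem, following the recursion in (Rabiner 1989); the function and variable names are illustrative choices, not from the paper.

```python
import numpy as np

def hmm_forward(A, B, pi, obs):
    """Classical forward algorithm: computes P(O | lambda) for a standard HMM.

    A   : (N, N) matrix, A[i, j] = P(q_t = s_j | q_{t-1} = s_i)
    B   : (N, M) matrix, B[i, k] = P(o_t = v_k | q_t = s_i), emission per state
    pi  : (N,) initial state distribution
    obs : sequence of observed symbol indices o_1 .. o_T
    """
    # alpha_1(i) = pi_i * b_i(o_1): the first symbol is emitted in the initial state
    alpha = pi * B[:, obs[0]]
    for o in obs[1:]:
        # alpha_t(j) = (sum_i alpha_{t-1}(i) * a_ij) * b_j(o_t)
        alpha = (alpha @ A) * B[:, o]
    # P(O | lambda) = sum_i alpha_T(i)
    return alpha.sum()
```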
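Under the eHMM definition above, the natural adaptation is to move the emission factor inside the sum over predecessor states, since the symbol o_t is now emitted on the transition from q_{t-1} to q_t, and to start the recursion at α_0(i) = π_i. The sketch below illustrates this adapted forward recursion and a corresponding Viterbi-style decoder; it is one plausible reading of the adaptation, not necessarily the paper's exact formulation.

```python
import numpy as np

def ehmm_forward(A, B, pi, obs):
    """Forward algorithm adapted to eHMM: emissions attached to transitions.

    A   : (N, N), A[i, j] = P(q_t = s_j | q_{t-1} = s_i)
    B   : (N, N, M), B[i, j, k] = P(o_t = v_k | q_{t-1} = s_i, q_t = s_j)
    pi  : (N,), pi[i] = P(q_0 = s_i)
    obs : sequence of observed symbol indices o_1 .. o_T
    """
    alpha = pi.copy()  # alpha_0(i) = pi_i: the state sequence starts at t = 0
    for o in obs:
        # alpha_t(j) = sum_i alpha_{t-1}(i) * a_ij * b_ij(o_t)
        alpha = alpha @ (A * B[:, :, o])
    return alpha.sum()  # P(O | lambda) = sum_j alpha_T(j)

def ehmm_viterbi(A, B, pi, obs):
    """Most probable state sequence q_0 .. q_T for an eHMM (Decoding)."""
    delta = pi.copy()  # delta_0(i) = pi_i
    backpointers = []
    for o in obs:
        # delta_t(j) = max_i delta_{t-1}(i) * a_ij * b_ij(o_t)
        scores = delta[:, None] * A * B[:, :, o]   # scores[i, j]
        backpointers.append(scores.argmax(axis=0)) # best predecessor per state
        delta = scores.max(axis=0)
    # backtrack from the most probable final state
    path = [int(delta.argmax())]
    for bp in reversed(backpointers):
        path.append(int(bp[path[-1]]))
    return list(reversed(path)), float(delta.max())

# Tiny usage example with made-up numbers (N = 2 states, M = 2 symbols):
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.full((2, 2, 2), 0.5)   # uniform emissions on every transition
pi = np.array([0.5, 0.5])
print(ehmm_forward(A, B, pi, [0, 1, 0]))
# -> 0.125: with uniform emissions, each step halves the total probability mass
```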
Publication year: 2007